- Path: nntp.teleport.com!sschaem
- From: sschaem@teleport.com (Stephan Schaem)
- Newsgroups: comp.sys.amiga.programmer
- Subject: Re: PPC compilers
- Date: 25 Jan 1996 11:47:23 GMT
- Organization: Teleport - Portland's Public Access (503) 220-1016
- Distribution: world
- Message-ID: <4e7qkb$4fo@maureen.teleport.com>
- References: <4d42gg$i2p@ra.ibr.cs.tu-bs.de> <4dov8s$rc5@ar.ar.com.au> <38232132@kone.fipnet.fi>
- NNTP-Posting-Host: kelly.teleport.com
- X-Newsreader: TIN [version 1.2 PL2]
-
- Jyrki Saarinen (jsaarinen@kone.fipnet.fi) wrote:
-
- : > To do fast 16:16 bit fixed point maths (ie for bitmap scaling). You have
- : > the fraction in the high word and the integer part in the low word.
- : > Because of the addx, every time the fraction wraps around, the X bit is set
- : > and next add, the integer bit is incremented.
-
- : Addx is really great .. Hmm. We could think of all the
- : situations where addx could possibly go wrong. (I mean
- : fixed point interpolations)
-
- : At least this has to be done before the loop:
- : moveq #0,d0
- : add.l d1,d0
-
- I don't see what this does... clear X? sub.l d0,d0 does that.
-
- : ...
- : .loop addx.l d1,d2
- : dbf d7,.loop
-
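A minimal sketch of the single-addx loop quoted above, written in Python because the 68k code cannot be run here. The helper names (`addx_l`, `pack`, `step_with_addx`, `step_reference`) are made up for illustration, and the model assumes non-negative steps and an integer part that never overflows 16 bits. It shows the accuracy concern: the carry from each fraction add only reaches the integer word on the following add, so the integer word can read one too low until then.

```python
MASK32 = 0xFFFFFFFF

def addx_l(src, dst, x):
    """Model of ADDX.L: dst + src + X, returning the 32-bit result and the new X (carry out)."""
    total = (dst & MASK32) + (src & MASK32) + x
    return total & MASK32, total >> 32

def pack(integer, frac):
    """Swapped layout from the post: fraction in the high word, integer in the low word."""
    return ((frac & 0xFFFF) << 16) | (integer & 0xFFFF)

def step_with_addx(start, delta, count):
    """start and delta are ordinary 16.16 values (integer in the high word).

    Returns one (integer_word, pending_X) pair per add.
    """
    acc = pack(start >> 16, start & 0xFFFF)
    inc = pack(delta >> 16, delta & 0xFFFF)
    x = 0  # X cleared before the loop, e.g. with sub.l d0,d0
    out = []
    for _ in range(count):
        acc, x = addx_l(inc, acc, x)  # .loop addx.l d1,d2
        out.append((acc & 0xFFFF, x))
    return out

def step_reference(start, delta, count):
    """Plain 16.16 stepping, for comparison: the exact integer part after each add."""
    out = []
    pos = start
    for _ in range(count):
        pos = (pos + delta) & MASK32
        out.append(pos >> 16)
    return out
```

With a start of 0 and a step of 1.5 (0x18000), the integer word reads 1, 2, 4, 5 over four adds while the exact integer parts are 1, 3, 4, 6. Adding the pending X to the integer word closes the gap each time.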
- : But what about if there are two addx? Their decimal parts
- : have to be switched, which is done nicely with eor
- : before swapping, but what about that before loop
- : correction? Of course this perfect accuracy
- : is not required on 16.16 fixed point but if
- : there is a smaller amount of fraction, the
- : errors can be quite big..
-
- You actually do 3 adds per instruction:
-
- 1) addx.l d2,d0
- 2) addx.l d3,d1
-
- =
-
- 1) aa'[iteration] += aa"[iteration] + aoverflow of (ab' + ab")
- boverflow = bb' + bb"[iteration]
- 2) bb'[iteration] += bb"[iteration] + boverflow of (bb' + bb")
- aoverflow = ab' + ab"[iteration+1]
-
- So, to enter and leave the loop correctly:
-
- a) use a loop count of loopcount - 1
- b) do the ab' + ab" add before you enter the loop...
- c) use addx.w d3,d1 after the dbra so you don't do an extra ab' + ab" operation
-
- So:
- {
- aoverflow = ab' + ab"[iteration]
- LOOP-1:
- 1) aa'[iteration] += aa"[iteration] + aoverflow of (ab' + ab")
- boverflow = bb' + bb"
- 2) bb'[iteration] += bb"[iteration] + boverflow of (bb' + bb")
- aoverflow = ab' + ab"[iteration+1]
- ENDLOOP:
- 1) aa'[iteration] += aa"[iteration] + aoverflow of (ab' + ab")
- boverflow = bb' + bb"
- 2) bb'[iteration] += bb"[iteration] + boverflow of (bb' + bb")
- }
-
- That translates to:
-
- ...
- move.l d3,d4
- sub.w d4,d4 ; d4 = d3 with the integer word cleared
- add.l d4,d1 ; only add decimal part (sets X for the first addx)
- .Loop addx.l d2,d0
- addx.l d3,d1
- dbra d7,.Loop ; dbra does not touch X
- addx.l d2,d0
- addx.w d3,d1 ; only add integer part
- ...
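The sequence above can be checked with a small Python model of the registers. This is a sketch under stated assumptions, not the poster's code: the helper names are made up, the steps are non-negative, the integer parts never overflow 16 bits, and the pre-loop add is taken to use d4 (the copy of d3 with its integer word cleared). Each register holds its own integer in the low word and the other value's fraction in the high word, so the X left by one addx feeds the integer of the next.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF
HIGH16 = 0xFFFF0000

def addx(src, dst, x, bits=32):
    """Model of ADDX at the given width: returns (result, new X)."""
    mask = (1 << bits) - 1
    total = (dst & mask) + (src & mask) + x
    return total & mask, total >> bits

def step_two_values(a, da, b, db, steps):
    """Advance two ordinary 16.16 values by `steps` adds each (steps >= 1)."""
    # Cross the fractions over: each register carries the OTHER value's fraction.
    d0 = ((b & MASK16) << 16) | (a >> 16)    # [b fraction  : a integer ]
    d1 = ((a & MASK16) << 16) | (b >> 16)    # [a fraction  : b integer ]
    d2 = ((db & MASK16) << 16) | (da >> 16)  # [db fraction : da integer]
    d3 = ((da & MASK16) << 16) | (db >> 16)  # [da fraction : db integer]

    d4 = d3 & HIGH16                         # move.l d3,d4 / sub.w d4,d4
    d1, x = addx(d4, d1, 0)                  # add.l d4,d1 (assumed): first a-fraction add
    for _ in range(steps - 1):               # loop body runs steps - 1 times
        d0, x = addx(d2, d0, x)              # addx.l d2,d0
        d1, x = addx(d3, d1, x)              # addx.l d3,d1
    d0, x = addx(d2, d0, x)                  # addx.l d2,d0
    low, x = addx(d3, d1, x, bits=16)        # addx.w d3,d1: integer word only
    d1 = (d1 & HIGH16) | low

    # Uncross the halves back into ordinary 16.16 values.
    a_out = ((d0 & MASK16) << 16) | (d1 >> 16)
    b_out = ((d1 & MASK16) << 16) | (d0 >> 16)
    return a_out, b_out
```

For example, `step_two_values(0, 0x18000, 0x28000, 0xC000, 4)` steps a from 0 by 1.5 and b from 2.5 by 0.75 four times and returns `(0x60000, 0x58000)`, that is 6.0 and 5.5, the same as plain 32-bit adds. One difference from the assembly: a dbra loop always runs at least once, so the real code needs at least two steps, while this model also accepts one.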
-
- If you only use an 8-bit integer, you can have 24 bits for the decimal part.
- That is basically the case when stepping through a 256 x Y (max 256) texture map array.
-
- Stephan
-